entire internet


The Entire Internet Is Reverting to Beta

The Atlantic - Technology

A car that accelerates instead of braking every once in a while is not ready for the road. A faucet that occasionally spits out boiling water instead of cold does not belong in your home. Working properly most of the time simply isn't good enough for technologies that people rely on heavily. And two and a half years after the launch of ChatGPT, generative AI is becoming such a technology. Even without actively seeking out a chatbot, billions of people are now pushed to interact with AI when searching the web, checking their email, using social media, and shopping online.


Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice

Li, Jonathan, Bhambhoria, Rohan, Dahan, Samuel, Zhu, Xiaodan

arXiv.org Artificial Intelligence

Generative AI models, such as the GPT and Llama series, have significant potential to assist laypeople in answering legal questions. However, little prior work focuses on the data sourcing, inference, and evaluation of these models in the context of laypersons. To this end, we propose a human-centric legal NLP pipeline, covering data sourcing, inference, and evaluation. We introduce and release a dataset, LegalQA, with real and specific legal questions spanning from employment law to criminal law, corresponding answers written by legal experts, and citations for each answer. We develop an automatic evaluation protocol for this dataset, then show that retrieval-augmented generation from only 850 citations in the train set can match or outperform internet-wide retrieval, despite containing 9 orders of magnitude less data. Finally, we propose future directions for open-sourced efforts, which fall behind closed-sourced models.


A.I. Is Sucking the Entire Internet In. What If You Could Yank Some of It Back Out?

Slate

A.I. image generators are divisive. But few can deny that they have gotten really good. Within seconds, you can type in a prompt to make a photorealistic image of Donald Trump getting arrested or turn your strangest idea into something tangible. Over the coming years, A.I. companies will release even more advanced models that will remind us that this is just the beginning. At least one of these tools will be different in an important way: It will be prohibited from seeing 80 million of the images that helped teach its predecessors to draw and paint.


Experts Say That Soon, Almost the Entire Internet Could Be Generated by AI

#artificialintelligence

The Internet of the future could be written by bots, but will that make it better or worse? Experts at the Copenhagen Institute for Future Studies (CIFS) are raising questions about AI-generated content, and how it could come to dominate the metaverse and other digital locations. CIFS expert Timothy Shoup estimates that 99 percent to 99.9 percent of the internet's content will be AI-generated by 2025 to 2030, especially if models like OpenAI's GPT-3 achieve wider adoption. "The internet would be completely unrecognizable," Shoup told colleague Sofie Hvitved. As its capabilities advance, the idea is that AI could start to generate entire online worlds, along with all the stuff that inhabits them -- not to mention all the online material that's currently mostly made by humans.


log4j: Tech companies scramble to fix software vulnerability that 'threatens entire internet'

The Independent - Tech

Tech companies across the world are under pressure to fix a software vulnerability that many cybersecurity experts are calling one of the worst to be discovered in recent years. The vulnerability, known as Log4Shell, was identified in Apache's Log4j software library that helps developers keep track of changes in the applications they build. The software flaw was first noticed on sites catering to the popular video game Minecraft, and was officially reported to Apache on 24 November by Chen Zhaojun of Alibaba, according to CrowdStrike. But it soon became clear that the vulnerability had far-reaching implications since the software is ubiquitous, used in millions of applications across the internet, including Amazon Web Services, Apple's iCloud, and the video game distribution service Steam. Experts say the vulnerability can allow hackers to carry out remote code execution (RCE) attacks against Java-based web servers, which they may use to take control of affected systems.


Log4j software bug is 'severe risk' to the entire internet

New Scientist

A major security flaw has been discovered in a piece of software called Log4j, which is used by millions of web servers. The bug leaves them vulnerable to attack, and teams around the world are scrambling to patch affected systems before hackers can exploit them. "The internet's on fire right now," said Adam Meyers at security company CrowdStrike. The problem with Log4j was first noticed in the video game Minecraft, but it quickly became apparent that its impact was far larger. The software is used in millions of web applications, including Apple's iCloud.


Supercomputer analyzes web traffic across entire internet

#artificialintelligence

Using a supercomputing system, MIT researchers have developed a model that captures what web traffic looks like around the world on a given day, which can be used as a measurement tool for internet research and many other applications. Understanding web traffic patterns at such a large scale, the researchers say, is useful for informing internet policy, identifying and preventing outages, defending against cyberattacks, and designing more efficient computing infrastructure. A paper describing the approach was presented at the recent IEEE High Performance Extreme Computing Conference. For their work, the researchers gathered the largest publicly available internet traffic dataset, comprising 50 billion data packets exchanged in different locations across the globe over a period of several years. They ran the data through a novel "neural network" pipeline operating across 10,000 processors of the MIT SuperCloud, a system that combines computing resources from the MIT Lincoln Laboratory and across the Institute.